352 research outputs found

    The effect of minor allele frequency on the likelihood of obtaining false positives

    Determining the most promising single-nucleotide polymorphisms (SNPs) presents a challenge in genome-wide association studies, in which hundreds of thousands of association tests are conducted. The power to detect genetic effects depends on minor allele frequency (MAF), and genome-wide association study SNP arrays include SNPs with a wide distribution of MAFs. Therefore, it is critical to understand MAF's effect on the false positive rate.

    Evaluation of exposure-specific risks from two independent samples: A simulation study

    Background: Previous studies have proposed a simple product-based estimator for calculating exposure-specific risks (ESR), but the methodology has not been rigorously evaluated. The goal of our study was to evaluate the existing methodology for calculating the ESR, propose an improved point estimator, and propose variance estimates that will allow the calculation of confidence intervals (CIs). Methods: We conducted a simulation study to test the performance of two estimators and their associated confidence intervals: 1) current (simple product-based estimator) and 2) proposed revision (revised product-based estimator). The first method for ESR estimation was based on multiplying a relative risk (RR) of disease given a certain exposure by an overall risk of disease. The second method, which is proposed in this paper, was based on estimates of the risk of disease in the unexposed. This updated risk was then multiplied by the RR to obtain the revised product-based estimator. A log-based variance was calculated for both estimators. In addition, a binomial-based variance was calculated for the revised product-based estimator. 95% CIs were calculated based on these variance estimates. Accuracy of point estimators was evaluated by comparing observed relative bias (percent deviation from the true estimate). Interval estimators were evaluated by coverage probabilities and expected length of the 95% CI, given coverage. We evaluated these estimators across a wide range of exposure probabilities, disease probabilities, relative risks, and sample sizes. Results: We observed more bias and lower coverage probability when using the existing methodology. The revised product-based point estimator exhibited little observed relative bias (max: 4.0%) compared to the simple product-based estimator (max: 93.9%). Because the simple product-based estimator was biased, 95% CIs around this estimate exhibited low coverage probabilities. The 95% CI around the revised product-based estimator from the log-based variance provided better coverage in most situations. Conclusion: The currently accepted simple product-based method is a reasonable approach only when the exposure probability is small (< 0.05) and the RR is ≤ 3.0. The revised product-based estimator provides much improved accuracy.
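
The two point estimators described in this abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the risk in the unexposed, so the sketch assumes a binary exposure and derives that risk from the law of total probability: overall risk = p0 × (1 − pe) + p0 × RR × pe, where p0 is the risk in the unexposed and pe is the exposure probability. The function names are hypothetical, and the variance estimators are not reproduced here.

```python
def simple_esr(overall_risk, rr):
    """Simple product-based estimator: RR times the overall disease risk."""
    return rr * overall_risk


def revised_esr(overall_risk, rr, p_exposed):
    """Revised product-based estimator: RR times the risk in the unexposed.

    Assumption (not stated in the abstract): binary exposure, so that
    overall_risk = p0 * (1 - p_exposed) + p0 * rr * p_exposed,
    which gives p0 = overall_risk / (1 + p_exposed * (rr - 1)).
    """
    risk_unexposed = overall_risk / (1.0 + p_exposed * (rr - 1.0))
    return rr * risk_unexposed


def relative_bias(estimate, truth):
    """Percent deviation of an estimate from the true value."""
    return 100.0 * (estimate - truth) / truth


# Hypothetical population values, chosen only for illustration.
p0, rr, pe = 0.02, 3.0, 0.3
overall = p0 * (1 - pe) + p0 * rr * pe   # overall disease risk
true_esr = rr * p0                       # true risk among the exposed

bias_simple = relative_bias(simple_esr(overall, rr), true_esr)
bias_revised = relative_bias(revised_esr(overall, rr, pe), true_esr)
```

With these hypothetical inputs the simple estimator overstates the exposure-specific risk by about 60%, while the revised estimator recovers it. Real data would use sample estimates rather than true population values, so some bias would remain.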

    Framingham Heart Study 100K project: genome-wide associations for cardiovascular disease outcomes

    BACKGROUND: Cardiovascular disease (CVD) and its most common manifestations, including coronary heart disease (CHD), stroke, heart failure (HF), and atrial fibrillation (AF), are major causes of morbidity and mortality. In many industrialized countries, CVD claims more lives each year than any other disease. Heart disease and stroke are the first and third leading causes of death in the United States. Prior investigations have reported several single gene variants associated with CHD, stroke, HF, and AF. We report a community-based genome-wide association study of major CVD outcomes. METHODS: In 1345 Framingham Heart Study participants from the largest 310 pedigrees (54% women, mean age 33 years at entry), we analyzed associations of 70,987 qualifying SNPs (Affymetrix 100K GeneChip) with four major CVD outcomes: major atherosclerotic CVD (n = 142; myocardial infarction, stroke, CHD death), major CHD (n = 118; myocardial infarction, CHD death), AF (n = 151), and HF (n = 73). Participants free of the condition at entry were included in proportional hazards models. We analyzed model-based deviance residuals using generalized estimating equations to test associations between SNP genotypes and traits in additive genetic models restricted to autosomal SNPs with minor allele frequency ≥ 0.10, genotype call rate ≥ 0.80, and Hardy-Weinberg equilibrium p-value ≥ 0.001. RESULTS: Six associations yielded p < 10⁻⁵. The lowest p-values for each CVD trait were as follows: major CVD, rs499818, p = 6.6 × 10⁻⁶; major CHD, rs2549513, p = 9.7 × 10⁻⁶; AF, rs958546, p = 4.8 × 10⁻⁶; HF, rs740363, p = 8.8 × 10⁻⁶. Of note, we found associations of a 13 kb region on chromosome 9p21 with major CVD (p = 1.7–1.9 × 10⁻⁵) and major CHD (p = 2.5–3.5 × 10⁻⁴) that confirm associations with CHD in two recently reported genome-wide association studies. Also, rs10501920 in CNTN5 was associated with AF (p = 9.4 × 10⁻⁶) and HF (p = 1.2 × 10⁻⁴). Complete results for these phenotypes can be found at the dbGaP website http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007. CONCLUSION: No association attained genome-wide significance, but several intriguing findings emerged. Notably, we replicated associations of chromosome 9p21 with major CVD. Additional studies are needed to validate these results. Finding genetic variants associated with CVD may point to novel disease pathways and identify potential targeted preventive therapies.
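
The SNP inclusion criteria quoted in this abstract, and the statement that no association reached genome-wide significance, can be sketched as follows. The filter thresholds are taken from the abstract. The abstract does not say which significance threshold was applied, so the Bonferroni correction over 70,987 SNPs is an assumption used only for illustration, and the function names are hypothetical.

```python
def passes_qc(maf, call_rate, hwe_p,
              min_maf=0.10, min_call_rate=0.80, min_hwe_p=0.001):
    """Return True if a SNP meets the inclusion thresholds from the abstract."""
    return maf >= min_maf and call_rate >= min_call_rate and hwe_p >= min_hwe_p


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction.

    Assumption: the abstract does not state the threshold that was used.
    """
    return alpha / n_tests


threshold = bonferroni_threshold(70_987)   # roughly 7.0e-7
lowest_reported_p = 4.8e-6                 # AF, rs958546, from the abstract
is_significant = lowest_reported_p < threshold
```

Under this assumed threshold, the lowest reported p-value does not qualify, which is consistent with the abstract's conclusion.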

    Genome-Wide Association of Pericardial Fat Identifies a Unique Locus for Ectopic Fat

    Pericardial fat is a localized fat depot associated with coronary artery calcium and myocardial infarction. We hypothesized that genetic loci would be associated with pericardial fat independent of other body fat depots. Pericardial fat was quantified in 5,487 individuals of European ancestry from the Framingham Heart Study (FHS) and the Multi-Ethnic Study of Atherosclerosis (MESA). Genotyping was performed using standard arrays and imputed to ∼2.5 million HapMap SNPs. Each study performed a genome-wide association analysis of pericardial fat adjusted for age, sex, weight, and height. A weighted z-score meta-analysis was conducted, and validation was obtained in an additional 3,602 multi-ethnic individuals from the MESA study. We identified a genome-wide significant signal in our primary meta-analysis at rs10198628 near TRIB2 (MAF 0.49, p = 2.7 × 10⁻⁸). This SNP was not associated with visceral fat (p = 0.17) or body mass index (p = 0.38), although we observed direction-consistent, nominal significance with visceral fat adjusted for BMI (p = 0.01) in the Framingham Heart Study. Our findings were robust among African ancestry (n = 1,442, p = 0.001), Hispanic (n = 1,399, p = 0.004), and Chinese (n = 761, p = 0.007) participants from the MESA study, with a combined p-value of 5.4 × 10⁻¹⁴. We observed TRIB2 gene expression in the pericardial fat of mice. rs10198628 near TRIB2 is associated with pericardial fat but not measures of generalized or visceral adiposity, reinforcing the concept that there are unique genetic underpinnings to ectopic fat distribution.
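
A weighted z-score meta-analysis of the kind mentioned in this abstract can be sketched in a few lines. The abstract does not specify the weights, so the sketch assumes the common choice of weighting each study by the square root of its sample size. The function names and the example inputs are hypothetical.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study z-scores into a single meta-analysis z-score.

    Assumption: weights are sqrt(sample size), one common convention;
    the abstract does not state which weights were used.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical per-study results, for illustration only.
z_combined = weighted_z_meta([2.0, 2.0], [100, 100])
p_combined = two_sided_p(z_combined)
```

Because the weights are normalized by the square root of their summed squares, the combined statistic is again standard normal under the null hypothesis, so two studies pointing in the same direction yield a stronger combined signal than either alone.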

    Multiancestry Study of Gene-Lifestyle Interactions for Cardiovascular Traits in 610 475 Individuals From 124 Cohorts: Design and Rationale

    Background: Several consortia have pursued genome-wide association studies to identify novel genetic loci for blood pressure, lipids, hypertension, and other traits. They demonstrated the power of collaborative research through meta-analysis of study-specific results. Methods and Results: The Gene-Lifestyle Interactions Working Group was formed to facilitate the first large, concerted, multiancestry study to systematically evaluate gene–lifestyle interactions. In stage 1, genome-wide interaction analysis is performed in 53 cohorts with a total of 149 684 individuals from multiple ancestries. In stage 2, involving an additional 71 cohorts with 460 791 individuals from multiple ancestries, focused analysis is performed for a subset of the most promising variants from stage 1. In all, the study involves up to 610 475 individuals. The current focus is on cardiovascular traits including blood pressure and lipids, and lifestyle factors including smoking, alcohol, education (as a surrogate for socioeconomic status), physical activity, psychosocial variables, and sleep. The total sample sizes vary among projects because of missing data. Large-scale gene–lifestyle or, more generally, gene–environment interaction (G×E) meta-analysis studies can be cumbersome and challenging. This article describes the design and some of the approaches pursued in the interaction projects. Conclusions: The Gene-Lifestyle Interactions Working Group provides an excellent framework for understanding the lifestyle context of genetic effects and for identifying novel trait loci through analysis of interactions. An important and novel feature of our study is that the gene–lifestyle interaction (G×E) results may improve our knowledge about the underlying mechanisms for novel and already known trait loci.

    Nonsteroidal anti-inflammatory drug use and Alzheimer's disease risk: the MIRAGE Study

    BACKGROUND: Nonsteroidal anti-inflammatory drug (NSAID) use may protect against Alzheimer's disease (AD) risk. We sought to examine the association between NSAID use and risk of AD, and potential effect modification by APOE-ε4 carrier status and ethnicity. METHODS: The MIRAGE Study is a multi-center family study of genetic and environmental risk factors for AD. Subjects comprised 691 AD patients (probands) and 973 family members enrolled at 15 research centers between 1996 and 2002. The primary independent and dependent variables were prior NSAID use and AD case status, respectively. We stratified the dataset in order to evaluate whether the association between NSAID use and AD was similar in APOE-ε4 carriers and non-carriers. Ethnicity was similarly examined as an effect modifier. RESULTS: NSAID use was less frequent in cases compared to controls in the overall sample (adjusted OR = 0.64; 95% CI = 0.38–1.05). The benefit of NSAID use appeared more pronounced among APOE-ε4 carriers (adjusted OR = 0.49; 95% CI = 0.24–0.98) compared to non-carriers, although this association was not statistically significant. The pattern of association was similar in Caucasians and African Americans. CONCLUSIONS: NSAID use is inversely associated with AD and may be modified by APOE genotype. Prospective studies and clinical trials of sufficient power to detect effect modification by APOE-ε4 carrier status are needed.

    A comprehensive 1000 Genomes-based genome-wide association meta-analysis of coronary artery disease

    Existing knowledge of genetic variants affecting risk of coronary artery disease (CAD) is largely based on genome-wide association study (GWAS) analysis of common SNPs. Leveraging phased haplotypes from the 1000 Genomes Project, we report a GWAS meta-analysis of 185 thousand CAD cases and controls, interrogating 6.7 million common (MAF > 0.05) as well as 2.7 million low frequency (0.005 < MAF < 0.05) variants. In addition to confirmation of most known CAD loci, we identified 10 novel loci, eight additive and two recessive, that contain candidate genes that newly implicate biological processes in vessel walls. We observed intra-locus allelic heterogeneity but little evidence of low frequency variants with larger effects and no evidence of synthetic association. Our analysis provides a comprehensive survey of the fine genetic architecture of CAD, showing that genetic susceptibility to this common disease is largely determined by common SNPs of small effect size.

    A machine learning pipeline for quantitative phenotype prediction from genotype data

    Background: Quantitative phenotypes emerge everywhere in systems biology and biomedicine owing to a direct interest in quantitative traits, or to high individual variability that makes it hard or impossible to classify samples into distinct categories, as is often the case with complex common diseases. Machine learning approaches to genotype-phenotype mapping may significantly improve Genome-Wide Association Studies (GWAS) results by explicitly focusing on predictivity and optimal feature selection in a multivariate setting. It is, however, essential that stringent and well documented Data Analysis Protocols (DAP) are used to control sources of variability and ensure reproducibility of results. We present a genome-to-phenotype pipeline of machine learning modules for quantitative phenotype prediction. The pipeline can be applied for the direct use of whole-genome information in functional studies. As a realistic example, the problem of fitting complex phenotypic traits in heterogeneous stock mice from single nucleotide polymorphisms (SNPs) is considered here. Methods: The core element in the pipeline is the L1L2 regularization method based on the naïve elastic net. The method gives at the same time a regression model and a dimensionality reduction procedure suitable for correlated features. Model and SNP markers are selected through a DAP originally developed in the MAQC-II collaborative initiative of the U.S. FDA for the identification of clinical biomarkers from microarray data. The L1L2 approach is compared with standard Support Vector Regression (SVR) and with Recursive Jump Monte Carlo Markov Chain (MCMC). Algebraic indicators of stability of partial lists are used for model selection; the final panel of markers is obtained by a procedure at the chromosome scale, termed 'saturation', to recover SNPs in Linkage Disequilibrium with those selected. Results: With respect to both MCMC and SVR, comparable accuracies are obtained by the L1L2 pipeline. Good agreement is also found between SNPs selected by the L1L2 algorithms and candidate loci previously identified by a standard GWAS. The combination of L1L2-based feature selection with a saturation procedure tackles the issue of neglecting highly correlated features that affects many feature selection algorithms. Conclusions: The L1L2 pipeline has proven effective in terms of marker selection and prediction accuracy. This study indicates that machine learning techniques may support quantitative phenotype prediction, provided that adequate DAPs are employed to control bias in model selection.
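
The abstract names the naïve elastic net as the basis of the L1L2 method but does not write out the objective. One common formulation is shown below as a hedged illustration. The symbols are assumptions: X is the genotype matrix, y the quantitative phenotype, β the vector of SNP coefficients, and τ and μ the weights of the L1 and L2 penalties.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \lVert y - X\beta \rVert_2^2
  \;+\; \tau \lVert \beta \rVert_1
  \;+\; \mu \lVert \beta \rVert_2^2
```

In this form the L1 term drives many coefficients to exactly zero, which performs the marker selection, while the L2 term tends to keep groups of correlated SNPs together rather than picking one arbitrarily. That matches the abstract's description of a combined regression model and dimensionality reduction suitable for correlated features.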

    An Open Access Database of Genome-wide Association Results

    Background: The number of genome-wide association studies (GWAS) is growing rapidly, leading to the discovery and replication of many new disease loci. Combining results from multiple GWAS datasets may potentially strengthen previous conclusions and suggest new disease loci, pathways or pleiotropic genes. However, no database or centralized resource currently exists that contains anywhere near the full scope of GWAS results. Methods: We collected available results from 118 GWAS articles into a database of 56,411 significant SNP-phenotype associations and accompanying information, making this database freely available here. In doing so, we met and describe here a number of challenges to creating an open access database of GWAS results. Through preliminary analyses and characterization of available GWAS, we demonstrate the potential to gain new insights by querying a database across GWAS. Results: Using a genomic bin-based density analysis to search for highly associated regions of the genome, positive control loci (e.g., MHC loci) were detected with high sensitivity. Likewise, an analysis of highly repeated SNPs across GWAS identified replicated loci (e.g., APOE, LPL). At the same time we identified novel, highly suggestive loci for a variety of traits that did not meet genome-wide significance thresholds in prior analyses, in some cases with strong support from the primary medical genetics literature (SLC16A7, CSMD1, OAS1), suggesting these genes merit further study. Additional adjustment for linkage disequilibrium within most regions with a high density of GWAS associations did not materially alter our findings. Having a centralized database with standardized gene annotation also allowed us to examine the representation of functional gene categories (gene ontologies) containing one or more associations among top GWAS results. Genes relating to cell adhesion functions were highly over-represented among significant associations (p < 4.6 × 10⁻¹⁴), a finding which was not perturbed by a sensitivity analysis. Conclusion: We provide access to a full gene-annotated GWAS database which could be used for further querying, analyses or integration with other genomic information. We make a number of general observations. Of reported associated SNPs, 40% lie within the boundaries of a RefSeq gene and 68% are within 60 kb of one, indicating a bias toward gene-centricity in the findings. We found considerable heterogeneity in information available from GWAS, suggesting the wider community could benefit from standardization and centralization of results reporting.
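
The genomic bin-based density analysis mentioned in this abstract can be sketched as a simple count of associations per fixed-width window. The abstract does not give the bin width or the data layout, so the 1 Mb bins, the (chromosome, position) tuples, and the function names below are all assumptions for illustration, and the example positions are hypothetical.

```python
from collections import Counter


def bin_density(associations, bin_size=1_000_000):
    """Count associations falling in each (chromosome, bin index) window.

    Assumption: fixed-width, non-overlapping bins of `bin_size` base pairs.
    """
    counts = Counter()
    for chrom, position in associations:
        counts[(chrom, position // bin_size)] += 1
    return counts


def densest_bins(counts, top=3):
    """Return the `top` bins with the most associations, densest first."""
    return counts.most_common(top)


# Hypothetical (chromosome, base-pair position) records.
records = [
    ("6", 31_000_000),
    ("6", 31_500_000),
    ("6", 32_100_000),
    ("19", 45_400_000),
]
counts = bin_density(records)
top_bin = densest_bins(counts, top=1)
```

Bins holding unusually many associations would be the candidates for highly associated regions. A fuller analysis would also need to account for linkage disequilibrium within a bin, as the abstract notes.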

    Attenuation of microvascular function in those with cardiovascular disease is similar in patients of Indian Asian and European descent

    Addresses: Institute of Biomedical and Clinical Science, Peninsula Medical School (Exeter), University of Exeter, UK. PMCID: PMC2823616. Types: Comparative Study; Journal Article; Multicenter Study; Research Support, Non-U.S. Gov't. © 2010 Strain et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Indian Asians are at increased risk of cardiovascular death which does not appear to be explained by conventional risk factors. As microvascular disease is also more prevalent in Indian Asians, and as it is thought to play a role in the development of macrovascular disease, we decided to determine whether impaired microcirculation could contribute to this increased cardiovascular risk in Indian Asians.